Empirical risk minimization (ERM) is a principle in statistical learning theory which defines a family of learning algorithms and is used to give theoretical bounds on their performance.

== Background ==

Consider the following situation, which is a general setting of many supervised learning problems. We have two spaces of objects <math>X</math> and <math>Y</math> and would like to learn a function <math>h: X \to Y</math> (often called ''hypothesis'') which outputs an object <math>y \in Y</math>, given <math>x \in X</math>. To do so, we have at our disposal a ''training set'' of <math>n</math> examples <math>(x_1, y_1), \ldots, (x_n, y_n)</math>, where <math>x_i \in X</math> is an input and <math>y_i \in Y</math> is the corresponding response that we wish to get from <math>h(x_i)</math>.

To put it more formally, we assume that there is a joint probability distribution <math>P(x, y)</math> over <math>X</math> and <math>Y</math>, and that the training set consists of <math>n</math> instances <math>(x_1, y_1), \ldots, (x_n, y_n)</math> drawn i.i.d. from <math>P(x, y)</math>. Note that the assumption of a joint probability distribution allows us to model uncertainty in predictions (e.g. from noise in data) because <math>y</math> is not a deterministic function of <math>x</math>, but rather a random variable with conditional distribution <math>P(y \mid x)</math> for a fixed <math>x</math>.

We also assume that we are given a non-negative real-valued loss function <math>L(\hat{y}, y)</math> which measures how different the prediction <math>\hat{y}</math> of a hypothesis is from the true outcome <math>y</math>. The ''risk'' associated with hypothesis <math>h(x)</math> is then defined as the expectation of the loss function:

: <math>R(h) = \mathbf{E}[L(h(x), y)] = \int L(h(x), y) \, dP(x, y).</math>

A loss function commonly used in theory is the 0-1 loss function: <math>L(\hat{y}, y) = I(\hat{y} \ne y)</math>, where <math>I(\cdot)</math> is the indicator notation.

The ultimate goal of a learning algorithm is to find a hypothesis <math>h^*</math> among a fixed class of functions <math>\mathcal{H}</math> for which the risk <math>R(h)</math> is minimal:

: <math>h^* = \arg\min_{h \in \mathcal{H}} R(h).</math>
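To make these definitions concrete, the following is a minimal Python sketch (the names <code>zero_one_loss</code>, <code>risk</code>, <code>best_hypothesis</code>, and the toy data are illustrative assumptions, not part of the article). It approximates the risk <math>R(h)</math> by averaging the 0-1 loss over a finite i.i.d. sample and then selects the minimizer over a small, fixed hypothesis class.

<syntaxhighlight lang="python">
import numpy as np

def zero_one_loss(y_pred, y_true):
    """0-1 loss: 1 when the prediction differs from the true label, 0 otherwise."""
    return float(y_pred != y_true)

def risk(h, samples, loss=zero_one_loss):
    """Approximate R(h) = E[L(h(x), y)] by averaging the loss over
    samples (x_i, y_i) assumed drawn i.i.d. from the joint distribution P(x, y)."""
    return sum(loss(h(x), y) for x, y in samples) / len(samples)

def best_hypothesis(hypothesis_class, samples, loss=zero_one_loss):
    """Return the hypothesis in a fixed, finite class H that minimizes the
    (approximated) risk, i.e. h* = argmin_{h in H} R(h)."""
    return min(hypothesis_class, key=lambda h: risk(h, samples, loss))

# Hypothetical toy problem: x is a real number, y indicates whether x >= 0,
# and H contains two simple threshold rules.
rng = np.random.default_rng(0)
samples = [(x, int(x >= 0)) for x in rng.normal(size=1000)]

H = [
    lambda x: int(x >= 0.0),   # threshold at 0
    lambda x: int(x >= 0.5),   # threshold at 0.5
]

h_star = best_hypothesis(H, samples)
print("approximate risk of h*:", risk(h_star, samples))
</syntaxhighlight>

Replacing the true expectation by an average over a finite sample, as done here, is exactly the ''empirical'' risk that gives the principle its name.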